DISTRIBUTION  STATEMENT  A.  Approved  for  public  release;  distribution  is  unlimited. 


Cetacean  Density  Estimation  from  Novel  Acoustic  Datasets  by 
Acoustic  Propagation  Modeling 

Martin Siderius and Elizabeth Thorp Küsel
Portland  State  University 
Electrical  and  Computer  Engineering  Department 
1900  SW  4th  Ave. 

Portland,  OR  97201 

phone:  (503)  725-3223  fax:  (503)  725-3807  email:  siderius@pdx.edu 

David  K.  Mellinger 
Oregon  State  University 

Cooperative  Institute  for  Marine  Resources  Studies 
2030  SE  Marine  Science  Dr. 

Newport,  OR  97365 

phone:  (541)  867-0372  fax:  (541)  867-3907  email:  David.Mellinger@oregonstate.edu 

Award  Number:  N00014-12-1-0207 
http://www.ece.pdx.edu/Faculty/Siderius.php 


LONG-TERM  GOALS 

This project’s long-term goal is the application and refinement of population density estimation
methods based on detections of marine mammal vocalizations combined with propagation modeling.
The density estimation method is applied to a novel acoustic data set, collected by a single
hydrophone, to estimate the population density of false killer whales (Pseudorca crassidens) off the
Kona coast of the Island of Hawai’i.

OBJECTIVES 

The objectives of this research are to apply existing methods for cetacean density estimation from
passive acoustic recordings made by single sensors to novel data sets and cetacean species, and to
refine the existing techniques in order to develop a more generalized model that can be applied to
many species in different environmental scenarios. The chosen study area is well suited to the
development of techniques that incorporate accurate modeling of sound propagation because of the
complexity of its environment. Moreover, the target species chosen for this work, the false killer
whale, suffers from interaction with the fisheries industry, and its population has been reported to have
declined in the past 20 years. Mark-recapture abundance studies of false killer whales in Hawai’i
will provide results that can be compared with those obtained by this project. The
ultimate goal is to contribute to the development of population density estimation methodologies that
will be readily available to those involved in marine mammal research, monitoring, and mitigation.

APPROACH


Approach  to  Estimating  Population  Density 

The methodology employed in this study to estimate the population density of false killer whales off
Kona, Hawai’i, is based on the works of Zimmer et al. (2008), Marques et al. (2009), and Küsel et al.
(2011). The density estimator formula given by Marques et al. (2009) is applied here for the case of
one sensor, yielding the following formulation:


\[
\hat{D} = \frac{n_c \, (1 - c)}{\pi \, w^2 \, P \, T \, r} \qquad (1)
\]


In equation (1), n_c corresponds to the total number of auto-detected clicks in some time period T. The
parameter c accounts for the rate of false positive detections. The maximum distance beyond which
we do not expect to detect any calls is given by w. The cue production rate, r, depends on available
studies and information on animal acoustic behavior. More specifically, a cue is defined in this study
as an echolocation click, which has been a preferred cue type for density estimation studies
from single-sensor data sets. Finally, the most important parameter in equation (1) for our
methodology is the average probability of detection, P. Because detection distances cannot be measured
from single-sensor data, the average detection probability is estimated in a Monte Carlo simulation
that uses the sonar equation along with transmission loss calculations to estimate the received
signal-to-noise ratio (SNR) of tens of thousands of click realizations. In the Monte Carlo simulation, clicks are
randomly distributed in 3D space inside a circular area of radius w around the sensor location.
Simulated SNRs are then passed through the detection function, characterized from SNRs measured in
the data set, which gives the probability that a click with the simulated SNR would be detected. The
average probability of detection over all Monte Carlo realizations gives the P to be used in equation (1).
Finally, by combining the total number of detected clicks, the proportion of false positive detections,
the total time of data analyzed, and the average click production rate with the average probability of
detection, we arrive at an estimate of the population density. The density estimation methodology is
illustrated in Fig. 1.
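
As a rough illustration of the procedure just described, the Python sketch below estimates P by Monte Carlo and then evaluates equation (1). It is not the implementation used in this project: spherical spreading with a constant absorption coefficient stands in for the propagation model, the logistic detection function is hypothetical, and every numeric value (source level, noise level, depths, cue rate, click count) is a placeholder chosen only to make the example run.

```python
import math
import random


def density_estimate(n_c, c, w_km, p_det, t_s, r_per_s):
    """Equation (1): density (animals per km^2) from a single sensor."""
    return n_c * (1.0 - c) / (math.pi * w_km**2 * p_det * t_s * r_per_s)


def transmission_loss_db(range_m, alpha_db_per_km):
    """Assumed stand-in for a propagation model: spherical spreading plus absorption."""
    return 20.0 * math.log10(range_m) + alpha_db_per_km * range_m / 1000.0


def detection_probability(snr_db, snr50_db=10.0, slope_db=2.0):
    """Hypothetical logistic detection function of SNR (50% detection at snr50_db)."""
    return 1.0 / (1.0 + math.exp(-(snr_db - snr50_db) / slope_db))


def average_detection_probability(w_km, n_clicks=20000, seed=0,
                                  source_level_db=200.0, noise_level_db=60.0,
                                  alpha_db_per_km=10.0, max_depth_m=500.0,
                                  sensor_depth_m=600.0):
    """Average the detection probability over clicks placed at random in 3D."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_clicks):
        # Uniform in area inside radius w: horizontal range scales with sqrt(U).
        horizontal_m = 1000.0 * w_km * math.sqrt(rng.random())
        depth_m = rng.uniform(0.0, max_depth_m)
        slant_m = max(math.hypot(horizontal_m, sensor_depth_m - depth_m), 1.0)
        # Passive sonar equation: SNR = SL - TL - NL.
        snr_db = (source_level_db
                  - transmission_loss_db(slant_m, alpha_db_per_km)
                  - noise_level_db)
        total += detection_probability(snr_db)
    return total / n_clicks


# Hypothetical inputs, for illustration only.
p_hat = average_detection_probability(w_km=10.0)
d_hat = density_estimate(n_c=5000, c=0.3, w_km=10.0, p_det=p_hat,
                         t_s=3600.0, r_per_s=1.0)
print(f"P = {p_hat:.4f}, D = {d_hat:.5f} animals/km^2")
```

Because P appears in the denominator of equation (1), any bias in the simulated detection probability carries over inversely to the density estimate, which is why the choices made inside the Monte Carlo loop matter so much.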


Potential  Problems  in  the  Estimation  of  Detection  Probability 

Continuous-wave  (CW)  analysis,  that  is,  single-frequency  analysis,  is  inherent  to  basic  forms  of  the 
passive  sonar  equation.  In  the  analysis  detailed  above  it  is  typical  to  calculate  transmission  loss  only  at 
the  center  frequency  of  the  click.  This  is  then  used  to  estimate  received  SNRs.  However,  many 
echolocation  clicks  can  be  very  broadband  in  nature,  with  10-dB  bandwidths  of  20  to  40  kHz  or  more. 
Recently, Ainslie (2013) showed by means of analytical formulations that calculating transmission loss
by CW analysis at the click’s center frequency while disregarding its bandwidth introduces
bias into detection probabilities and hence into population density estimates. He further suggested using a
broadband correction factor in the passive sonar equation to avoid errors in estimates caused by large
call bandwidths.

Figure 1. Summary of single-sensor population density estimation methodology.

In  view  of  the  potential  issue  described  above,  we  examine  the  methodology  that  has  been  used  to 
estimate  detection  probabilities  of  highly  broadband  clicks  recorded  by  single  instruments.  Using 
simple  modeling  experiments  based  on  synthetic  and  real  data  sets  that  have  highly  broadband  signals, 
we  quantify  the  bias  in  the  sonar  equation  estimates  of  detection  probability  and  its  effect  on  density 
estimates. Furthermore, we discuss the use of transmission loss as an appropriate measure for
calculating the SNR of received clicks, as well as the use of complex propagation models that
require detailed environmental information, which most often does not exist. Lastly, we look into the
effects of including multipath clicks in density estimates.
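
To give a feel for the size of the effect discussed in this section, the Python sketch below compares single-frequency transmission loss at a click's center frequency with an effective broadband loss obtained by averaging the linear power transmission factor over the band. This is not Ainslie's correction factor. It assumes spherical spreading, Thorp's empirical absorption formula, and a source spectrum that is flat across the band; the 10-60 kHz band and 35 kHz center frequency are borrowed from the synthetic signal used in the modeling experiments.

```python
import math


def thorp_absorption_db_per_km(f_khz):
    """Thorp's empirical seawater absorption in dB/km (frequency in kHz)."""
    f2 = f_khz**2
    return (0.11 * f2 / (1.0 + f2) + 44.0 * f2 / (4100.0 + f2)
            + 2.75e-4 * f2 + 0.003)


def tl_cw_db(range_m, f_khz):
    """Single-frequency loss: spherical spreading plus absorption."""
    return (20.0 * math.log10(range_m)
            + thorp_absorption_db_per_km(f_khz) * range_m / 1000.0)


def tl_broadband_db(range_m, f_lo_khz, f_hi_khz, n_bins=501):
    """Effective loss for a flat-spectrum source (an assumption):
    average the linear transmission factor over the band, then convert to dB."""
    step = (f_hi_khz - f_lo_khz) / (n_bins - 1)
    mean_factor = sum(
        10.0 ** (-tl_cw_db(range_m, f_lo_khz + i * step) / 10.0)
        for i in range(n_bins)
    ) / n_bins
    return -10.0 * math.log10(mean_factor)


for range_m in (500.0, 2000.0, 5000.0):
    cw = tl_cw_db(range_m, 35.0)
    bb = tl_broadband_db(range_m, 10.0, 60.0)
    print(f"{range_m:6.0f} m: CW at 35 kHz = {cw:6.1f} dB, broadband = {bb:6.1f} dB")
```

At longer ranges the low-frequency end of the band dominates the received energy, so the band-averaged loss falls well below the center-frequency value. The gap grows with range, which is the regime where detection probability is most sensitive to the assumed loss.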

Population Density Estimation of False Killer Whales off Kona, Hawai’i

A test case using a real data set containing highly broadband false killer whale (Pseudorca crassidens)
clicks  recorded  off  the  Kona  coast  of  Hawai’i  was  used  to  further  investigate  the  single-sensor  density 
estimation  methodology.  In  this  case,  whale  acoustic  and  diving  behaviors  were  also  incorporated  into 
the  model.  From  literature  information  on  the  target  species’  diving  behavior  when  emitting  sounds,  a 
3D  random  distribution  of  simulated  animals  was  created  (Fig.  2),  taking  into  account  their  orientations 
with  respect  to  the  hydrophone.  The  simulated  animals  are  placed  inside  a  circle  in  which  the  center  is 
the  hydrophone  location  and  the  radius  corresponds  to  the  maximum  estimated  detection  distance  for 
false  killer  whale  clicks  in  the  local  environment,  which,  for  simulation  purposes,  is  taken  to  be  10  km. 
Source  level  is  taken  as  a  distribution  based  on  minimum  and  maximum  on-axis  values  reported  in  the 
literature  (Madsen  et  al.,  2004).  Information  on  directionality  loss  due  to  the  animal’s  beam  pattern  is 
also  taken  from  the  literature  (Au  et  al.,  1995).  Ambient  noise  levels  were  measured  from  the  acoustic 
data set. Transmission loss is calculated here as has been done before (Küsel et al., 2011), that is, using
an acoustic propagation model, and also by calculating arrival times and amplitudes and convolving
them with a false killer whale source click, in the same way the synthetic data set is created. Information
on source level, directionality loss, ambient noise, and transmission loss is then combined in the sonar
equation to estimate SNRs of thousands of click realizations. The remainder of the analysis follows
that described above in the section Approach to Estimating Population Density.

Figure 2. Kona Coast of Hawai’i showing location of HARP deployment (white dot in the center of
semi-circle) and random distribution of 10,000 simulated whale locations (red dots) around the
hydrophone where bathymetry is deep enough for them to perform foraging dives and hence
produce echolocation clicks.
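
The placement and orientation step described in this section can be sketched in Python as follows. This is an illustrative sketch only: it distributes animals over a full disk rather than the bathymetry-limited semi-circle of Fig. 2, and the hydrophone depth, dive depth range, and pitch range are hypothetical placeholders. The off-axis angle it returns is the quantity a beam pattern (such as that of Au et al., 1995) would map to a directionality loss; that mapping is not implemented here.

```python
import math
import random


def heading_vector(azimuth_rad, pitch_rad):
    """Unit vector (x, y, z with z positive down) for an azimuth and downward pitch."""
    return (math.cos(pitch_rad) * math.cos(azimuth_rad),
            math.cos(pitch_rad) * math.sin(azimuth_rad),
            math.sin(pitch_rad))


def off_axis_angle_deg(animal_xyz, heading, hydrophone_xyz):
    """Angle between the animal's heading and the direction to the hydrophone."""
    to_sensor = tuple(h - a for h, a in zip(hydrophone_xyz, animal_xyz))
    norm = math.sqrt(sum(v * v for v in to_sensor))
    cos_angle = sum(u * v for u, v in zip(heading, to_sensor)) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def simulate_animals(n, w_km, hydrophone_depth_m, max_dive_depth_m,
                     max_pitch_deg, seed=0):
    """Return (position, off-axis angle) pairs for n randomly placed animals."""
    rng = random.Random(seed)
    hydrophone = (0.0, 0.0, hydrophone_depth_m)
    animals = []
    for _ in range(n):
        rho_m = 1000.0 * w_km * math.sqrt(rng.random())  # uniform in area
        bearing = rng.uniform(0.0, 2.0 * math.pi)
        xyz = (rho_m * math.cos(bearing), rho_m * math.sin(bearing),
               rng.uniform(0.0, max_dive_depth_m))
        heading = heading_vector(
            rng.uniform(0.0, 2.0 * math.pi),
            math.radians(rng.uniform(-max_pitch_deg, max_pitch_deg)))
        animals.append((xyz, off_axis_angle_deg(xyz, heading, hydrophone)))
    return animals


# Hypothetical geometry: 10 km radius as in the text, other values assumed.
animals = simulate_animals(10_000, 10.0, hydrophone_depth_m=600.0,
                           max_dive_depth_m=500.0, max_pitch_deg=30.0)
```

Each off-axis angle would then be converted to a directionality loss and subtracted from the on-axis source level before the sonar equation is evaluated for that click.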


WORK  COMPLETED 

The work completed in 2014 includes 1) estimation of false positive detections from the Hawai’i data
set, and 2) design and execution of modeling experiments to understand the effect, on the estimation
of detection probability, of modeling broadband calls by their center frequency alone.

The outstanding action for this project is finishing the density estimation analysis of the Hawai’i
data set containing false killer whale echolocation clicks, in view of the results from the modeling
experiments. This entails running Monte Carlo simulations to estimate the average
probability of detection (P in Eq. (1)) and estimating the density of false killer whales for the period of the
data set being used. The probability of detection will be estimated both by using the click’s
center frequency alone and by taking the full bandwidth into consideration. Density estimates based on
these two approaches will be compared.

1)  Estimation  of false  positive  detections  from  the  Hawai’i  data  set 

The proportion of false positive detections was estimated by manually checking every 30th
auto-detection made by the software Ishmael against the data set. During the manual check,
auto-detections of reverberated clicks, which often spanned several detections, were considered false
positives. Clicks that were observed to be part of a buzz sequence were also treated as false positives.
Finally, clicks that looked clipped, that is, that were observed across the entire bandwidth in the
spectrogram, were also considered false positives.

2) Design and execution of modeling experiments for population density estimation

A series of modeling experiments was devised with increasing degrees of complexity to examine the
effect of high-frequency and highly broadband calls on density estimates. All experiments were based
on the single-sensor density estimation formula (Eq. (1)). First, a simple experiment was conducted
in which 1000 points were randomly distributed inside circular areas of radius 10 and 20 km. Different
detection circles were assumed inside both circular areas, defined by the radius at which
source level minus transmission loss equaled the noise level. The synthetic data were created using a
single frequency, and detection probabilities were estimated using a higher frequency than that of the
data. The objective was to show that using a frequency in the calculations different from that of the
original data changes the detection circle, so that density is consequently under- or
overestimated. Usually the parameter w in Eq. (1) can be taken as something larger than the expected
detection range. Taking the larger radius of 20 km, detection circles of increasing radius were
investigated through 20 different realizations of synthetic data to examine the effect on the variance
of density estimates.

A  more  complex  synthetic  data  set  was  then  created  by  calculating  arrival  amplitudes  for  each  of  the 
100  points  randomly  distributed  inside  an  8  km  circular  area.  Arrival  amplitudes  were  convolved  with 
a  synthetic  and  highly  broadband  signal.  Realistic  ambient  noise  data  was  also  added  to  the  received 
signals.  Analysis  of  this  synthetic  data  set  and  its  modeling  was  performed  following  four  distinct 
cases.  Case  1  considered  a  5  kHz  bandwidth  centered  on  the  signal’s  center  frequency  (35  kHz)  and 
disregarded  all  multipath  arrivals.  Case  2  considered  the  full  bandwidth  of  the  signal  but  still 
disregarded  multipath  arrivals.  Case  3  considered  both  the  full  bandwidth  and  the  multipath  arrivals. 
Finally,  case  4  considered  the  5  kHz  narrow  band  around  the  center  frequency  and  multipath  arrivals. 
By  knowing  the  exact  number  of  points,  or  animals,  we  could  investigate  how  well  the  density 
estimator  performed  and  the  effect  of  choosing  different  frequency  bands  for  the  detection  and 
modeling  components  of  the  analysis. 
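
The construction of a received signal from arrival amplitudes can be sketched in Python with NumPy as shown below. This is a simplified illustration, not the procedure used to build the synthetic data set: the Gaussian-windowed tone burst is a hypothetical stand-in for the broadband source signal, arrival amplitudes are taken as real numbers (ray models generally return complex amplitudes), delays are rounded to the nearest sample, and white Gaussian noise replaces the realistic ambient noise data.

```python
import numpy as np


def synthetic_click(fs_hz, fc_hz=35000.0, duration_s=100e-6):
    """Hypothetical source click: a Gaussian-windowed tone burst."""
    t = np.arange(int(round(duration_s * fs_hz))) / fs_hz
    envelope = np.exp(-0.5 * ((t - duration_s / 2.0) / (duration_s / 6.0)) ** 2)
    return envelope * np.sin(2.0 * np.pi * fc_hz * t)


def impulse_response(arrival_times_s, amplitudes, fs_hz):
    """Put each arrival amplitude at its nearest sample, relative to the first arrival."""
    times = np.asarray(arrival_times_s, dtype=float)
    delays = np.rint((times - times.min()) * fs_hz).astype(int)
    h = np.zeros(delays.max() + 1)
    # add.at accumulates correctly if two arrivals land on the same sample.
    np.add.at(h, delays, np.asarray(amplitudes, dtype=float))
    return h


def received_signal(arrival_times_s, amplitudes, click, fs_hz,
                    noise_std=0.0, seed=0):
    """Convolve the channel impulse response with the click and add noise."""
    h = impulse_response(arrival_times_s, amplitudes, fs_hz)
    y = np.convolve(h, click)
    if noise_std > 0.0:
        y = y + np.random.default_rng(seed).normal(0.0, noise_std, y.size)
    return y


fs_hz = 200_000.0
click = synthetic_click(fs_hz)
# Hypothetical direct path and one weaker, later multipath arrival.
signal = received_signal([1.000, 1.001], [1.0e-3, 0.4e-3], click, fs_hz,
                         noise_std=1.0e-5)
```

Keeping only the first arrival in the lists reproduces the "no multipath" cases, and band-pass filtering the result before detection would reproduce the narrowband cases.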

RESULTS 

1) Results on the estimation of false positive detections from the Hawai’i data set

Checking every 30th auto-detection from the total of 260,973 clicks detected in the 2.5-hour period of
continuous data being analyzed yielded a false positive rate of 30.84%. This
corresponds to parameter c in Eq. (1).
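
The arithmetic behind this step is sketched in Python below, using the detection count and false positive rate reported above. Two points are assumptions and not part of the reported analysis: the systematic sample is taken to start at the 30th detection, and the standard error treats the checked detections as a simple random sample with binomial variance.

```python
import math


def false_positive_rate(labels):
    """labels: booleans, True where a manually checked detection was a false positive."""
    labels = list(labels)
    return sum(labels) / len(labels)


def binomial_standard_error(rate, n_checked):
    """Approximate standard error of a proportion (simple random sample assumed)."""
    return math.sqrt(rate * (1.0 - rate) / n_checked)


n_detections = 260_973                  # auto-detections reported in the text
step = 30                               # every 30th detection is checked
n_checked = n_detections // step        # detections 30, 60, ... (assumed start)
c = 0.3084                              # false positive rate reported in the text

n_true = n_detections * (1.0 - c)       # numerator n_c (1 - c) of Eq. (1)
se = binomial_standard_error(c, n_checked)
print(n_checked, round(n_true), round(se, 4))
```

Under these assumptions about 8,700 detections are checked by hand, roughly 180,500 detections are retained as true clicks, and the uncertainty on c is on the order of half a percentage point.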

2)  Results  on  the  modeling  experiments 

For the simple modeling experiment, the expected probability of detection is given by the ratio of the
detection area to the total area. For example, a 5 km detection radius inside a 20 km
circle results in an expected detection probability of (5/20)^2 = 0.0625. The total number of animals
divided by the total area considered gives the expected density. So, for the case of 1000 animals inside
the 20 km circle, the expected density is 0.7958 animals/km². We observed from the results that when
the frequency of the data was also used in the calculations, expected and simulated probabilities of
detection and density estimates agreed well. On the other hand, when a higher frequency is used in the
Monte Carlo simulation, the probability of detection is underestimated and consequently the density
is overestimated. These results can be better visualized through Figs. 3 and 4. The synthetic data used
in this simple example is shown in Fig. 3 along with the detection range of 5 km at 20 kHz and the
corresponding detection circle for a frequency of 40 kHz. As the frequency increases, the detection
radius, and consequently the number of animals inside the circle, decreases.
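
The expected values quoted above, and the direction of the bias, can be reproduced with the short Python sketch below. It simplifies equation (1) by setting c = 0 and assuming one cue per animal (T r = 1). The 2.5 km model radius is a hypothetical value chosen only to show the effect of a detection circle that is too small; it is not the 40 kHz radius from the experiment.

```python
import math
import random


def simulate_ranges(n_animals, w_km, rng):
    """Horizontal ranges of animals placed uniformly in area inside radius w."""
    return [w_km * math.sqrt(rng.random()) for _ in range(n_animals)]


def density_from_counts(n_detected, w_km, p_det):
    """Eq. (1) with c = 0 and one cue per animal (T * r = 1)."""
    return n_detected / (math.pi * w_km**2 * p_det)


w_km, n_animals = 20.0, 1000
true_radius_km = 5.0     # detection radius at the frequency of the data
model_radius_km = 2.5    # hypothetical radius from modeling at a higher frequency

expected_p = (true_radius_km / w_km) ** 2           # ratio of areas
expected_density = n_animals / (math.pi * w_km**2)  # animals per km^2

rng = random.Random(1)
ranges = simulate_ranges(n_animals, w_km, rng)
n_detected = sum(r <= true_radius_km for r in ranges)

matched = density_from_counts(n_detected, w_km, expected_p)
mismatched = density_from_counts(n_detected, w_km,
                                 (model_radius_km / w_km) ** 2)
print(expected_p, round(expected_density, 4), matched, mismatched)
```

Because P scales with the square of the detection radius, halving the modeled radius divides P by four and inflates the density estimate by the same factor.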


Figure 3. Synthetic data created for simple modeling experiment on density estimation
with 1000 animals uniformly distributed inside a circular area of radius 20 km. The detection
circle of 5 km at 20 kHz is shown as the blue curve. The red circle represents the corresponding
detection circle at 40 kHz.

Figure 4 assumes a source level of 155 dB re 1 µPa²/Hz. Source level minus transmission loss for
sources of frequency 20 and 40 kHz is then plotted against range. The corresponding noise levels at the
two frequencies considered are also plotted (straight blue and red lines), and the distance at which
detection occurs is indicated by the dashed black vertical lines. As the frequency increases, the
detection range decreases. For the case of 1000 animals inside an area of radius 20 km and a detection
circle of 5 km at 20 kHz, estimating density using a 40 kHz frequency gives a calculated density of
approximately 3.5 animals/km², instead of approximately 0.8 animals/km².

Figure 4. Source level minus transmission loss curves assuming a source level of 155 dB re 1
µPa²/Hz at 20 (blue curve) and 40 kHz (red curve). Detection distances are indicated by
black dashed lines.

Finally, taking the 20 km radial area, detection circles of increasing radius were assumed (2, 5, 10, 15,
and 18 km), with up to 20 different realizations of synthetic data for each. For each data
realization a density estimate was calculated, and the variance of all estimates was taken. It was
observed that the closer the detection range was to the radius of the area over which animals were
distributed, the better the density estimate and the lower its variance across realizations of the
synthetic data. This result suggests that the parameter w in Eq. (1) should not be arbitrarily large, but
within a short margin of the expected detection distance.
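
A minimal Python sketch of this variance experiment is given below, under the same simplifications as the simple modeling experiment (hard detection circle, c = 0, one cue per animal). The random seed and the way realizations are generated are assumptions made for illustration.

```python
import math
import random
import statistics


def density_estimates(detect_radius_km, w_km=20.0, n_animals=1000,
                      n_realizations=20, seed=0):
    """Density estimates from repeated synthetic realizations for one detection radius."""
    rng = random.Random(seed)
    p_det = (detect_radius_km / w_km) ** 2     # ratio of areas
    area_km2 = math.pi * w_km**2
    estimates = []
    for _realization in range(n_realizations):
        n_detected = sum(
            w_km * math.sqrt(rng.random()) <= detect_radius_km
            for _animal in range(n_animals)
        )
        estimates.append(n_detected / (area_km2 * p_det))
    return estimates


variances = {
    radius_km: statistics.variance(density_estimates(radius_km))
    for radius_km in (2, 5, 10, 15, 18)
}
for radius_km, variance in variances.items():
    print(f"detection radius {radius_km:2d} km: variance = {variance:.5f}")
```

With w fixed, a small detection circle means few animals are detected in each realization, so the count is noisy and is then divided by a small P, which amplifies that noise in the density estimate.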

The analysis of the complex synthetic data set followed the same process as would be used with measured
data. Results of the density estimation calculations, shown in Table 1, indicate good agreement
between the expected density and the estimate calculated using a narrow bandwidth (5 kHz)
around the center frequency and no multipath detections. Considering the full bandwidth of the
synthetic signal (10-60 kHz) in case 2 caused the density estimate to increase. Case 3, which
considered both the full bandwidth and multiple arrivals, yielded a density estimate
approximately double that of case 2. A close look at transmission loss indicates that, for the
purpose of estimating population density, this parameter could be calculated using the simpler
spherical spreading law plus high-frequency attenuation. Moreover, each detected click corresponds to
one arrival, whereas transmission loss combines all the arrivals. Therefore, another alternative would
be to use a ray model calculation of arrival times and amplitudes for the specific environment and
convolve it with a source “click” in the same manner as the synthetic data set was created. That way,
the full spectrum of the call would be taken into account when estimating density, and multipath could
also be taken into account. It is also worth noting that multiple arrivals can be taken into account in
the calculation of P.

From the simple modeling examples it is clear that the P estimate should be consistent with the detected
clicks, that is, with what is being measured. However, real measured data present a series of
complexities that also need to be taken into account: there is no ground truth for comparison; clicks
are also distributed in depth and usually have a narrow beam pattern; in some environments
reverberation could be present, causing clicks to be non-distinctive; and the click
production rate might not be known for the species of interest.


Table 1. Results of density estimation analysis on synthetic data set with 100 animals uniformly
distributed inside a circular area of radius 8 km. Expected density is 497 animals/1000 km².
Transmission loss was calculated using Bellhop and results were taken at 600 m.

Case    Frequency band    Multipath    n_c    P    D
1       32.5-37.5 kHz     No
2       10-60 kHz         No
3       10-60 kHz         Yes
4       32.5-37.5 kHz     Yes


IMPACT/APPLICATIONS 

The application of recently developed density estimation methods to different data sets and marine
mammal species provides opportunities to test the methodology and make it more general. It was noted,
however, that such methodology is not “one size fits all,” since, as observed in the present study, the
frequency band of calls will influence, for example, how to simulate them appropriately. When
studying species that are considered threatened or endangered in any way, as is the case with false
killer whales in Hawai’i, it is hoped that density estimation methods from passive acoustics can
become a tool to help monitor, study, and protect those populations. Another interesting component of
this project is the development of more efficient and accurate propagation modeling practices for
estimating the probability of detection of marine mammal calls, by performing convergence tests and
propagating the field straight to each simulated animal instead of performing interpolation. The
ultimate goal is to develop easy-to-use software to make density estimation readily available to the
Navy and to those involved in marine mammal research, monitoring, and mitigation. By improving our
capabilities for monitoring marine mammals, we hope to contribute to minimizing and mitigating the
impacts of man-made activities on these marine organisms.